14 research outputs found

    SoK: Cryptographically Protected Database Search

    Full text link
    Protected database search systems cryptographically isolate the roles of reading from, writing to, and administering the database. This separation limits unnecessary administrator access and protects data in the case of system breaches. Since protected search was introduced in 2000, the area has grown rapidly; systems are offered by academia, start-ups, and established companies. However, there is no best protected search system or set of techniques. Design of such systems is a balancing act between security, functionality, performance, and usability. This challenge is made more difficult by ongoing database specialization, as some users will want the functionality of SQL, NoSQL, or NewSQL databases. This database evolution will continue, and the protected search community should be able to quickly provide functionality consistent with newly invented databases. At the same time, the community must accurately and clearly characterize the tradeoffs between different approaches. To address these challenges, we provide the following contributions: 1) An identification of the important primitive operations across database paradigms. We find there are a small number of base operations that can be used and combined to support a large number of database paradigms. 2) An evaluation of the current state of protected search systems in implementing these base operations. This evaluation describes the main approaches and tradeoffs for each base operation. Furthermore, it puts protected search in the context of unprotected search, identifying key gaps in functionality. 3) An analysis of attacks against protected search for different base queries. 4) A roadmap and tools for transforming a protected search system into a protected database, including an open-source performance evaluation platform and initial user opinions of protected search.Comment: 20 pages, to appear to IEEE Security and Privac

    Fully Homomorphic Encryption for RAMs

    Get PDF
    We initiate the study of fully homomorphic encryption for RAMs (RAM-FHE). This is a public-key encryption scheme where, given an encryption of a large database DD, anybody can efficiently compute an encryption of P(D)P(D) for an arbitrary RAM program PP. The running time over the encrypted data should be as close as possible to the worst case running time of PP, which may be sub-linear in the data size. A central difficulty in constructing a RAM-FHE scheme is hiding the sequence of memory addresses accessed by PP. This is particularly problematic because an adversary may homomorphically evaluate many programs over the same ciphertext, therefore effectively ``rewinding\u27\u27 any mechanism for making memory accesses oblivious. We identify a necessary prerequisite towards constructing RAM-FHE that we call \emph{rewindable oblivious RAM} (rewindable ORAM), which provides security even in this strong adversarial setting. We show how to construct rewindable ORAM using \emph{symmetric-key doubly efficient PIR (SK-DEPIR)} (Canetti-Holmgren-Richelson, Boyle-Ishai-Pass-Wootters: TCC \u2717). We then show how to use rewindable ORAM, along with virtual black-box (VBB) obfuscation for specific circuits, to construct RAM-FHE. The latter primitive can be heuristically instantiated using existing indistinguishability obfuscation candidates. Overall, we obtain a RAM-FHE scheme where the multiplicative overhead in running time is polylogarithmic in the data size NN. Our basic scheme is single-hop, but we also extend it to get multi-hop RAM-FHE with overhead NN^\epsilon for arbitrarily small >0\epsilon>0. We view our work as the first evidence that RAM-FHE is likely to exist

    Private Anonymous Data Access

    Get PDF
    We consider a scenario where a server holds a huge database that it wants to make accessible to a large group of clients. After an initial setup phase, clients should be able to read arbitrary locations in the database while maintaining privacy (the server does not learn which locations are being read) and anonymity (the server does not learn which client is performing each read). This should hold even if the server colludes with a subset of the clients. Moreover, the run-time of both the server and the client during each read operation should be low, ideally only poly-logarithmic in the size of the database and the number of clients. We call this notion Private Anonymous Data Access (PANDA). PANDA simultaneously combines aspects of Private Information Retrieval (PIR) and Oblivious RAM (ORAM). PIR has no initial setup, and allows anybody to privately and anonymously access a public database, but the server\u27s run-time is linear in the data size. On the other hand, ORAM achieves poly-logarithmic server run-time, but requires an initial setup after which only a single client with a secret key can access the database. The goal of PANDA is to get the best of both worlds: allow many clients to privately and anonymously access the database as in PIR, while having an efficient server as in ORAM. In this work, we construct bounded-collusion PANDA schemes, where the efficiency scales linearly with a bound on the number of corrupted clients that can collude with the server, but is otherwise poly-logarithmic in the data size and the total number of clients. Our solution relies on standard assumptions, namely the existence of fully homomorphic encryption, and combines techniques from both PIR and ORAM. We also extend PANDA to settings where clients can write to the database

    Multi-Key Searchable Encryption, Revisited

    Get PDF
    We consider a setting where users store their encrypted documents on a remote server and can selectively share documents with each other. A user should be able to perform keyword searches over all the documents she has access to, including the ones that others shared with her. The contents of the documents, and the search queries, should remain private from the server. This setting was considered by Popa et al. (NSDI \u2714) who developed a new cryptographic primitive called Multi-Key Searchable Encryption (MKSE), together with an instantiation and an implementation within a system called Mylar, to address this goal. Unfortunately, Grubbs et al. (CCS \u2716) showed that the proposed MKSE definition fails to provide basic security guarantees, and that the Mylar system is susceptible to simple attacks. Most notably, if a malicious Alice colludes with the server and shares a document with an honest Bob then the privacy of all of Bob\u27s search queries is lost. In this work we revisit the notion of MKSE and propose a new strengthened definition that rules out the above attacks. We then construct MKSE schemes meeting our definition. We first give a simple and efficient construction using only pseudorandom functions. This construction achieves our strong security definition at the cost of increasing the server storage overhead relative to Mylar, essentially replicating the document each time it is shared. We also show that high server storage overhead is not inherent, by giving an alternate (albeit impractical) construction that manages to avoid it using obfuscation

    Cryptography for Big Data Security

    Get PDF
    As big data collection and analysis becomes prevalent in today鈥檚 computing environments there is a growing need for techniques to ensure security of the collected data. To make matters worse, due to its large volume and velocity, big data is commonly stored on distributed or shared computing resources not fully controlled by the data owner. Thus, tools are needed to ensure both the confidentiality of the stored data and the integrity of the analytics results even in untrusted environments. In this chapter, we present several cryptographic approaches for securing big data and discuss the appropriate use scenarios for each. We begin with the problem of securing big data storage. We first address the problem of secure block storage for big data allowing data owners to store and retrieve their data from an untrusted server. We present techniques that allow a data owner to both control access to their data and ensure that none of their data is modified or lost while in storage. However, in most big data applications, it is not sufficient to simply store and retrieve one鈥檚 data and a search functionality is necessary to allow one to select only the relevant data. Thus, we present several techniques for searchable encryption allowing database- style queries over encrypted data. We review the performance, functionality, and security provided by each of these schemes and describe appropriate use-cases. However, the volume of big data often makes it infeasible for an analyst to retrieve all relevant data. Instead, it is desirable to be able to perform analytics directly on the stored data without compromising the confidentiality of the data or the integrity of the computation results. We describe several recent cryptographic breakthroughs that make such processing possible for varying classes of analytics. We review the performance and security characteristics of each of these schemes and summarize how they can be used to protect big data analytics especially when deployed in a cloud setting. We hope that the exposition in this chapter will raise awareness of the latest types of tools and protections available for securing big data. We believe better understanding and closer collaboration between the data science and cryptography communities will be critical to enabling the future of big data processing

    Conclave: secure multi-party computation on big data (extended TR)

    Full text link
    Secure Multi-Party Computation (MPC) allows mutually distrusting parties to run joint computations without revealing private data. Current MPC algorithms scale poorly with data size, which makes MPC on "big data" prohibitively slow and inhibits its practical use. Many relational analytics queries can maintain MPC's end-to-end security guarantee without using cryptographic MPC techniques for all operations. Conclave is a query compiler that accelerates such queries by transforming them into a combination of data-parallel, local cleartext processing and small MPC steps. When parties trust others with specific subsets of the data, Conclave applies new hybrid MPC-cleartext protocols to run additional steps outside of MPC and improve scalability further. Our Conclave prototype generates code for cleartext processing in Python and Spark, and for secure MPC using the Sharemind and Obliv-C frameworks. Conclave scales to data sets between three and six orders of magnitude larger than state-of-the-art MPC frameworks support on their own. Thanks to its hybrid protocols, Conclave also substantially outperforms SMCQL, the most similar existing system.Comment: Extended technical report for EuroSys 2019 pape
    corecore